SpilloverDiD conley + survey + lag>0 via panel-block composition (Wave E.2 follow-up) by igerber · Pull Request #477 · igerber/diff-diff

igerber · 2026-05-21T00:25:32Z

Summary

Extends the panel-aware stratified-Conley spatial sandwich (Wave E.2 cross-sectional, PR #474) to conley_lag_cutoff > 0 by adding a within-PSU serial Bartlett HAC term (Newey-West 1987 separable form). The composition meat = meat_spatial + meat_serial has disjoint index sets, exactly matching the no-survey panel-block decomposition at diff_diff.conley._compute_conley_meat.
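Written out for a single stratum, the composition above can be sketched as follows. The notation is an assumption introduced here for illustration and does not appear in the PR: $z_{gt}$ is the centered score total of PSU $g$ in period $t$, $K$ is the spatial kernel evaluated at the distance $d_{gh}$ between PSUs $g$ and $h$, and survey factors such as the FPC are omitted.

```latex
\[
\text{meat}
= \underbrace{\sum_{t} \sum_{g,h} K(d_{gh})\, z_{gt} z_{ht}^{\top}}_{\texttt{meat\_spatial}\ (\text{same period},\ t = s)}
+ \underbrace{\sum_{g} \sum_{\substack{t \neq s \\ |t-s| \le L}}
  \left(1 - \frac{|t-s|}{L+1}\right) z_{gt} z_{gs}^{\top}}_{\texttt{meat\_serial}\ (\text{same PSU},\ t \neq s)}
\]
```

The first sum only touches pairs observed in the same period and the second only touches pairs from the same PSU in different periods, which is the sense in which the index sets are disjoint.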

- New sibling helper _compute_stratified_serial_bartlett_meat in diff_diff/two_stage.py (T=1 short-circuit, three-mode singleton-stratum branching, panel-wide FPC, panel-wide dense time codes, zeroed centering for singleton-active-period cells)
- Orchestrator _compute_stratified_conley_meat extended with conley_lag_cutoff kwarg; spatial loop unchanged; serial helper called after the spatial loop when L > 0
- Post-resolution fail-closed gate at SpilloverDiD.fit for no-effective-PSU + lag>0 (fires AFTER _inject_cluster_as_psu so the documented cluster=<col> injection surface continues to work)
- 24 new tests across two follow-up test classes (aggregate + event-study)
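The "panel-wide dense time codes" item can be illustrated with a short sketch. It assumes, as a reading of the description rather than a copy of the library code, that dense coding means ranking the distinct period labels across the whole panel so that lags count periods instead of calendar gaps. `dense_time_codes` is a hypothetical name.

```python
import numpy as np


def dense_time_codes(time_values):
    """Map raw period labels to 0..T-1 by rank over the whole panel (hypothetical helper)."""
    # np.unique sorts the distinct labels; return_inverse gives each row's rank.
    _, codes = np.unique(np.asarray(time_values), return_inverse=True)
    return codes


# Unbalanced panel: one PSU is observed in 2001, 2003, 2007; another only in 2003 and 2007.
times = [2001, 2003, 2007, 2003, 2007]
codes = dense_time_codes(times)
```

Under this coding 2003 and 2007 are one lag apart, so a cutoff of L=1 pairs them even though four calendar years separate them.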

Methodology references

Documented synthesis of:

Method names: stratified-Conley panel-block sandwich = spatial (Wave E.2) + serial Bartlett HAC
Paper / source links:
- Conley (1999) "GMM Estimation with Cross Sectional Dependence" — spatial kernel
- Newey-West (1987) "A Simple, Positive Semi-Definite HAC Covariance Matrix" — serial Bartlett kernel weights (1 - |t-s|/(L+1))
- Binder (1983) "On the Variances of Asymptotically Normal Estimators from Complex Surveys" — FPC factor form
- Gerber (2026, arXiv:2605.04124) Proposition 1 — Binder TSL composition with two-stage IF
- Wave D Gardner GMM correction (Butts 2021 §3.1 + Gardner 2022 §4) on SpilloverDiD's ring-indicator stage-2 design
Intentional deviations:
- Serial term uses per-period within-stratum centering (Binder TSL form), NOT raw scores like the no-survey panel-block reference at conley.py:949-965. Documented in REGISTRY ("Centering asymmetry vs no-survey reference"): the no-survey path assumes E[scores] = 0 so centering is a no-op; survey-weighted Binder TSL needs explicit centering or it inflates variance by twice the squared per-period stratum mean.
- FPC for the serial term uses panel-wide n_h_panel per stratum, NOT per-period n_h_t. Standalone Newey-West composition on stratified clusters — the serial sum is a panel-level construct so the cluster set is panel-wide. Spatial term keeps its existing per-period FPC unchanged.
- Requires an effective PSU (explicit survey_design.psu OR cluster=<col> injected as PSU per Wave E.1's _inject_cluster_as_psu). No-effective-PSU survey designs raise NotImplementedError per feedback_no_silent_failures (pseudo-PSU = obs-index fallback would silently zero the serial sum). Tracked in TODO.md.

Full details in docs/methodology/REGISTRY.md section "Variance (Wave E.2 follow-up - conley_lag_cutoff > 0 panel-block composition via spatial + serial Bartlett HAC)".
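The centering deviation can be made concrete with a minimal sketch of a within-PSU serial Bartlett sum for one stratum on a balanced panel. This is not the code in `diff_diff/two_stage.py`: the function names and array layout are assumptions, and the FPC and singleton-stratum handling described above are left out.

```python
import numpy as np


def bartlett_weight(lag, lag_cutoff):
    """Newey-West Bartlett weight 1 - |lag| / (L + 1), zero beyond the cutoff."""
    lag = abs(lag)
    if lag > lag_cutoff:
        return 0.0
    return 1.0 - lag / (lag_cutoff + 1)


def serial_bartlett_meat(scores, lag_cutoff, center=True):
    """Serial cross-product sum for ONE stratum (illustrative, no FPC).

    scores has shape (n_psu, n_periods, k): PSU-by-period score totals.
    """
    n_psu, n_periods, k = scores.shape
    meat = np.zeros((k, k))
    if n_periods == 1 or lag_cutoff == 0:
        return meat  # no period pairs to combine
    # Per-period centering across the PSUs of the stratum (assumed Binder TSL form).
    z = scores - scores.mean(axis=0, keepdims=True) if center else scores
    for lag in range(1, min(lag_cutoff, n_periods - 1) + 1):
        w = bartlett_weight(lag, lag_cutoff)
        # Sum over PSUs g and period pairs (t, t + lag) of z[g, t] z[g, t + lag]'.
        cross = np.einsum("gtk,gtl->kl", z[:, :-lag, :], z[:, lag:, :])
        meat += w * (cross + cross.T)  # add both orderings of each pair
    return meat


# Scores that are identical across PSUs: centering removes them entirely,
# while the raw version accumulates the squared per-period mean.
constant_scores = np.ones((4, 3, 2))
meat_centered = serial_bartlett_meat(constant_scores, lag_cutoff=1, center=True)
meat_raw = serial_bartlett_meat(constant_scores, lag_cutoff=1, center=False)
```

In this toy case the centered meat is zero and the raw meat is not, which is the kind of gap the raw-vs-centered hand-check is described as pinning.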

Validation

Tests added/updated: tests/test_spillover.py (24 new test methods across TestSpilloverDiDWaveE2FollowupConleySurveyLagCutoff and TestSpilloverDiDWaveE2FollowupConleySurveyLagCutoffEventStudy). Existing test_j0_panel_conley_lag_cutoff_rejected_under_survey (Wave E.2-era gate assertion) deleted.
Coverage: lag=0 strict bit-identity to shipped Wave E.2 (mock-spy + meat parity), raw-vs-centered hand-check, L=1 + L=2 hand-computation methodology anchors, AR(1) DGP behavioral SE inflation (rho=0.7, > 5%), cross-stratum independence, panel-wide dense time codes on unbalanced panel, singleton-adjust FPC skip, all-singleton saturation NaN-fail, singleton-active-period centering zeros, no-effective-PSU rejection, cluster-injected-PSU positive surface with SE parity vs explicit PSU, fit idempotency, drift goldens, event-study mirror on both is_staggered branches.
Backtest evidence: full _scratch/wave_e2_followup_smoke.py hand-computation anchor for the methodology composition.
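The AR(1) behavioral check listed under Coverage can be mimicked with a scalar toy. This is a sketch under stated assumptions, not the test in `tests/test_spillover.py`: the panel size, the seed, and the `hac_variance` helper are all hypothetical, and there is no spatial term or survey design.

```python
import numpy as np

rng = np.random.default_rng(0)
n_psu, n_periods, rho = 200, 10, 0.7

# Simulate one AR(1) score series per PSU, started from the stationary distribution.
eps = rng.standard_normal((n_psu, n_periods))
scores = np.empty_like(eps)
scores[:, 0] = eps[:, 0] / np.sqrt(1.0 - rho**2)
for t in range(1, n_periods):
    scores[:, t] = rho * scores[:, t - 1] + eps[:, t]


def hac_variance(z, lag_cutoff):
    """Same-period sum of squares plus Bartlett-weighted within-PSU lag products."""
    z = z - z.mean(axis=0, keepdims=True)  # per-period centering
    total = float((z**2).sum())
    for lag in range(1, min(lag_cutoff, z.shape[1] - 1) + 1):
        weight = 1.0 - lag / (lag_cutoff + 1)
        total += 2.0 * weight * float((z[:, :-lag] * z[:, lag:]).sum())
    return total


# Ratio of the lag=1 "SE" to the lag=0 "SE" for positively autocorrelated scores.
se_ratio = np.sqrt(hac_variance(scores, 1) / hac_variance(scores, 0))
```

With positive serial correlation the lag products are positive on average, so the ratio sits above 1; the PR's own threshold for its DGP is an inflation of more than 5%.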

Security / privacy

Confirm no secrets/PII in this PR: Yes

…e E.2 follow-up)

Extends the panel-aware stratified-Conley spatial sandwich (Wave E.2 cross-sectional, PR #474) to `conley_lag_cutoff > 0` by adding a within-PSU serial Bartlett HAC term (Newey-West 1987 separable form). The composition `meat = meat_spatial + meat_serial` has disjoint index sets, exactly matching the no-survey panel-block decomposition at `diff_diff.conley._compute_conley_meat`.

Methodology: documented synthesis of
- Conley (1999) spatial-HAC
- Newey-West (1987) serial Bartlett kernel weights `(1 - |t-s|/(L+1))`
- Binder (1983) / Gerber (2026) Prop 1 stratified TSL on Wave D Gardner GMM influence functions

Serial term uses per-period within-stratum centering (Binder TSL form, matching the spatial helper); panel-wide per-stratum FPC (the serial sum is a panel-level construct, so the cluster set is panel-wide); hardcoded Bartlett serial kernel regardless of `conley_kernel` (mirrors `conley.py:951-965`); panel-wide dense time codes for lag math (matches `conley.py:940` R deviation).

Supported surface: requires an effective PSU, either an explicit `survey_design.psu` OR a `cluster=<col>` argument that gets injected as the effective PSU per Wave E.1's `_inject_cluster_as_psu` routing. No-effective-PSU survey designs (weights-only / strata-only WITHOUT a cluster fallback) raise `NotImplementedError` post-resolution at `SpilloverDiD.fit` per `feedback_no_silent_failures`: the pseudo-PSU = obs-index fallback would silently zero the serial sum (each pseudo-PSU appears in exactly one period). Routing the serial loop to `conley_unit` would mix IF allocators with the spatial term and is queued as a follow-up.

Code changes:
- New sibling helper `_compute_stratified_serial_bartlett_meat` in `diff_diff/two_stage.py` (T=1 short-circuit, three-mode singleton-stratum branching with FPC inside the multi-PSU block to avoid divide-by-zero, panel-wide mean for `lonely_psu='adjust'`, zeroed centering for singleton-active-period cells so raw scores don't leak into the serial Bartlett cross-products under unbalanced panels)
- Orchestrator `_compute_stratified_conley_meat` extended with `conley_lag_cutoff` kwarg; spatial loop unchanged; serial helper called after spatial loop when `L > 0`
- Dispatch in `_compute_gmm_corrected_meat` conley branch threads `conley_lag_cutoff` through
- `spillover.py:2210` Wave E.2-era `NotImplementedError` gate for lag>0 + survey deleted; replaced with post-resolution fail-closed gate that fires only when `resolved_survey_fit.psu` is None AFTER cluster injection (so the documented `cluster=<col>` injection surface continues to work)

Tests: 24 new methods across two classes (`TestSpilloverDiDWaveE2FollowupConleySurveyLagCutoff` and `TestSpilloverDiDWaveE2FollowupConleySurveyLagCutoffEventStudy`):
- `test_a` lag=0 strict bit-identity to shipped Wave E.2 meat
- `test_a2` lag=0 does NOT invoke serial helper (mock-spy)
- `test_b` lag=1 invokes serial helper exactly once (mock-spy)
- `test_c0` raw-vs-centered hand-check pins Binder TSL centering
- `test_c1`/`test_c2` hand-computation methodology anchors at L=1 and L=2
- `test_c3` AR(1) DGP serial inflation behavioral pin (rho=0.7, > 5%)
- `test_d` single-stratum lag=1 finite output
- `test_e` cross-stratum independence of serial term (partition + sum)
- `test_f` singleton-adjust + lag=1 no divide-by-zero
- `test_f2` all-singleton-remove + lag=1 returns zero meat
- `test_g` unbalanced panel + panel-wide dense time codes (hand-computed)
- `test_g2` lag > T-1 well-defined
- `test_h` singleton-active-period centering zeros (sparse-period regression)
- `test_j` no-survey panel-block conley unchanged after gate relaxation
- `test_k` replicate-weight rejection still fires
- `test_l` cluster + lag=1 + survey warn-and-use-PSU
- `test_m` fit-idempotency under lag=1 + survey
- `test_n`/`test_n2` no-effective-PSU survey + lag>0 raises NotImplementedError
- `test_n3` cluster-injected effective-PSU surface fits + matches explicit PSU
- `test_r` drift goldens at lag=1 vs lag=0 (ATT invariant, SE differs)
- `test_o`/`test_p`/`test_r` event-study mirror (both is_staggered branches)

Existing `test_j0_panel_conley_lag_cutoff_rejected_under_survey` (Wave E.2-era gate-assertion) deleted.

Docs:
- REGISTRY `Variance (Wave E.2 follow-up)` subsection with documented-synthesis framing + cross-references + effective-PSU restriction
- `spillover.rst` Wave E.2 follow-up stanza
- CHANGELOG `[Unreleased]` bullet
- `llms.txt` + `README.md` catalog entries updated
- `references.rst` adds Newey-West (1987)
- TODO row deleted (old deferral); new row added for the no-effective-PSU follow-up tail

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

github-actions · 2026-05-21T00:34:48Z

Overall Assessment

✅ Looks good

Executive Summary

No unmitigated P0/P1 issues found in the new SpilloverDiD(vcov_type="conley", conley_lag_cutoff > 0, survey_design=...) path.
The core methodology choices for the new panel-block survey variance path are documented in docs/methodology/REGISTRY.md, and the unsupported no-effective-PSU surface is fail-closed and properly tracked in TODO.md.
P2: the new survey panel-block meat path drops the Conley non-PSD warning surface that the registry says should exist on GMM-corrected Conley paths.
P3: public docs still advertise the new support too broadly; they omit the effective-PSU restriction that SpilloverDiD.fit now enforces.
P3: some reduction/test wording overstates what is actually implemented or pinned.

Methodology

Severity: P2. docs/methodology/REGISTRY.md:L3229-L3231 says Conley non-PSD warnings apply on GMM-corrected paths, and diff_diff/conley.py:L972-L980 implements that on the shared no-survey Conley path. The new survey panel-block path composes meat = meat_spatial + meat_serial in diff_diff/two_stage.py:L806-L841 and computes the serial term in diff_diff/two_stage.py:L844-L1113, but never re-runs a final finite/eigenvalue check on the combined survey meat before SpilloverDiD.fit turns diagonals into SEs via sqrt(max(vcov[ii], 0)) at diff_diff/spillover.py:L3292-L3316. Impact: an indefinite survey panel-block vcov can now reach users without the warning the methodology registry promises. Concrete fix: after adding the serial term, apply the same non-finite guard and negative-eigenvalue warning that _compute_conley_meat applies, but on the final combined survey meat matrix.
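The kind of guard this finding asks for might look like the sketch below. It is an illustration only: `guard_combined_meat`, the tolerance, and the choice to return an all-NaN matrix for non-finite input are assumptions, not the behavior of `_compute_conley_meat`.

```python
import warnings

import numpy as np


def guard_combined_meat(meat, rtol=1e-10):
    """Finite check plus negative-eigenvalue warning on a combined meat (hypothetical)."""
    meat = np.asarray(meat, dtype=float)
    if not np.all(np.isfinite(meat)):
        # Assumed behavior: propagate NaN without warning about definiteness.
        return np.full_like(meat, np.nan)
    # eigvalsh expects a symmetric matrix, so symmetrize before checking.
    eigvals = np.linalg.eigvalsh(0.5 * (meat + meat.T))
    scale = max(float(np.abs(eigvals).max()), 1.0)
    if eigvals.min() < -rtol * scale:
        warnings.warn(
            "combined meat has a negative eigenvalue; the variance estimate "
            "may not be positive semi-definite",
            UserWarning,
            stacklevel=2,
        )
    return meat


psd = np.array([[2.0, 0.5], [0.5, 1.0]])
indefinite = np.array([[1.0, 2.0], [2.0, 1.0]])  # eigenvalues 3 and -1

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    guard_combined_meat(psd)
    n_warn_psd = len(caught)
    guard_combined_meat(indefinite)
    n_warn_indefinite = len(caught) - n_warn_psd
```

Running the check on the sum rather than on each term matters because two terms that are each only mildly indefinite can combine into a matrix with a clearly negative eigenvalue.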

Code Quality

No findings.

Performance

No findings.

Maintainability

Severity: P3. The new serial Bartlett implementation in diff_diff/two_stage.py:L1099-L1107 duplicates the panel time-kernel logic already implemented in diff_diff/conley.py:L949-L965. Impact: survey and no-survey Conley paths can drift on diagnostics and kernel behavior; the missing PSD warning above is already one example. Concrete fix: factor out the shared panel serial-kernel/post-check logic, or centralize final meat validation in one helper.

Tech Debt

Severity: P3-informational. The remaining no-effective-PSU gap is fail-closed in diff_diff/spillover.py:L3089-L3131 and explicitly tracked in TODO.md:L142. Impact: this does not block the PR; the unsupported surface is documented and does not silently return wrong numbers. Concrete fix: none required in this PR.
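Why the gap has to be fail-closed can be seen in a small sketch: when the PSU falls back to the observation index, no PSU is seen in more than one period, so there are no within-PSU period pairs for the serial sum. Both helpers below are hypothetical illustrations, not the gate in `diff_diff/spillover.py`.

```python
import numpy as np


def count_serial_pairs(psu_ids, time_codes, lag_cutoff):
    """Count within-PSU period pairs (t, s) with 0 < |t - s| <= lag_cutoff."""
    psu_ids = np.asarray(psu_ids)
    time_codes = np.asarray(time_codes)
    n_pairs = 0
    for psu in np.unique(psu_ids):
        periods = np.unique(time_codes[psu_ids == psu])
        gaps = np.abs(periods[:, None] - periods[None, :])
        # Each unordered pair appears twice in the gap matrix.
        n_pairs += int(((gaps > 0) & (gaps <= lag_cutoff)).sum()) // 2
    return n_pairs


def require_effective_psu(psu, lag_cutoff):
    """Raise instead of silently returning a zero serial term (hypothetical gate)."""
    if lag_cutoff > 0 and psu is None:
        raise NotImplementedError(
            "lag_cutoff > 0 needs an effective PSU (explicit PSU or injected cluster)"
        )


time_codes = [0, 1, 2, 0, 1, 2]
real_psu = ["a", "a", "a", "b", "b", "b"]
pseudo_psu = np.arange(6)  # obs-index fallback: one PSU per row

pairs_real = count_serial_pairs(real_psu, time_codes, lag_cutoff=1)
pairs_pseudo = count_serial_pairs(pseudo_psu, time_codes, lag_cutoff=1)
```

With real PSUs there are period pairs to weight; with the obs-index fallback there are none, so a lag>0 fit would look like it ran while contributing nothing from the serial term.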

Security

No findings.

Documentation/Tests

Severity: P3. SpilloverDiD.fit now requires an effective PSU for survey_design + vcov_type="conley" + conley_lag_cutoff > 0 (diff_diff/spillover.py:L3089-L3131), and the registry documents that restriction at docs/methodology/REGISTRY.md:L3299-L3303, but the public docs still advertise the feature more broadly in README.md:L109, diff_diff/guides/llms.txt:L61, and docs/api/spillover.rst:L329-L396 without that caveat. Impact: weights-only / strata-only survey users can hit an unexpected NotImplementedError despite the public docs saying the feature is supported. Concrete fix: add the effective-PSU restriction everywhere the new support is announced.
Severity: P3. The methodology/test wording overstates the implemented reduction and regression surface: the code uses centered PSU scores for both survey terms (diff_diff/survey.py:L2031-L2043, diff_diff/two_stage.py:L1005-L1110), but the new docs/changelog still describe single-stratum cases as “plain” Conley/Newey-West reductions (docs/methodology/REGISTRY.md:L3292-L3293, docs/api/spillover.rst:L370-L374, CHANGELOG.md:L13). Separately, tests/test_spillover.py:L6483-L6529 says it pins the “full meat matrix” at lag=0, but it only asserts att and se. Impact: the documented contract is harder to audit than the implementation. Concrete fix: reword those reductions to centered PSU totals plus the survey factor, and tone down the test/changelog language unless the meat matrix itself is exposed and asserted.

Execution note: this was a static diff review; I could not run the added tests in the provided environment because pytest and numpy were unavailable.

- P2, Methodology (missing PSD/finite warning on combined survey meat): mirror `_compute_conley_meat`'s finite + negative-eigenvalue guard on the combined `meat = meat_spatial + meat_serial` returned by the survey panel-block orchestrator. Both the radial 1-D Bartlett spatial kernel AND the Newey-West Bartlett serial kernel are practitioner specializations that are NOT formally PSD-guaranteed; adding two non-PSD-guaranteed terms can produce a more indefinite combined meat, so the diagnostic surface matters more on the panel-block path than the no-survey baseline. Guard fires after the saturation NaN-fail check (so genuinely-saturated meats NaN-propagate without spurious warning).
- P3, Documentation (public docs missing effective-PSU restriction): README.md, diff_diff/guides/llms.txt, and docs/api/spillover.rst now mention the effective-PSU requirement for `conley_lag_cutoff > 0` (weights-only / strata-only without cluster fallback raises NotImplementedError). REGISTRY's Restrictions list already had the caveat; the public-surface docs are now consistent.
- P3, Wording (single-stratum reduction overclaim): REGISTRY and CHANGELOG single-stratum reduction descriptions clarified to "Conley sandwich on within-stratum-CENTERED PSU totals" (NOT plain). The Binder TSL centering is retained at H=1: under survey weights the per-period stratum mean is always subtracted from PSU scores before the kernel application, even when the stratum is the entire sample. The cross-sectional Wave E.2 reduction wording was already correct (centered).
- P3, Documentation (test_a `full meat matrix` overclaim): tightened test_a docstring to accurately describe what is pinned (ATT AND scalar SE bit-identity via assert_array_equal), with an explicit note that full meat-matrix equality is implied, not directly asserted, because the meat matrix is not exposed on `SpilloverDiDResults`.
- P3, Tech debt (code duplication, DEFERRED): the serial Bartlett kernel logic is duplicated between two_stage.py (survey path) and conley.py (no-survey path). Factoring out a shared kernel helper + shared PSD/finite guard is cosmetic and out of scope for this PR. Added a TODO.md row to track the refactor follow-up.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

github-actions · 2026-05-21T10:31:40Z

🔁 AI review rerun (requested by @igerber)

Head SHA: fbdfe5598cb4449a247d956c5feecef531692044

Overall Assessment

✅ Looks good

Executive Summary

Affected method: SpilloverDiD’s Wave E.2 follow-up survey-Conley variance path for vcov_type="conley", conley_lag_cutoff > 0.
Cross-checking the changed implementation against docs/methodology/REGISTRY.md and the local Conley/Newey-West/Binder/Gerber/Gardner references shows the core methodology is aligned: centered serial scores, panel-wide FPC on the serial term, post-injection effective-PSU gating, and the spatial+serial panel-block composition all match the documented contract.
The prior P2 finding on the missing combined-meat finite/PSD warning surface is resolved by the final guard in diff_diff/two_stage.py:L841-L869.
The prior public-doc caveat gap around the effective-PSU requirement is resolved in README.md:L109-L109, diff_diff/guides/llms.txt:L61-L61, and docs/api/spillover.rst:L329-L405.
Remaining issues are minor documentation/test-wording drift only; I found no unmitigated P0/P1 defects in the diff.
Execution note: static review only; I could not run pytest in this environment because pytest is not installed.

Methodology

No findings. The new survey lag>0 path in diff_diff/spillover.py:L3089-L3131 and diff_diff/two_stage.py:L549-L1143 is consistent with the registry contract at docs/methodology/REGISTRY.md:L3279-L3305. The load-bearing deviations from the no-survey reference are explicitly documented Notes, so they are not defects.

Code Quality

No findings.

Performance

No findings.

Maintainability

Severity: P3-informational. Impact: the serial Bartlett kernel logic and final meat-validation logic are still duplicated between diff_diff/two_stage.py:L841-L869, diff_diff/two_stage.py:L1097-L1143, and diff_diff/conley.py:L949-L990, so the survey and no-survey panel-block paths can still drift. Concrete fix: none required for approval; this is now explicitly tracked in TODO.md:L142-L142.

Tech Debt

Severity: P3-informational. Impact: the lag>0 no-effective-PSU survey surface remains unsupported, but the PR now fail-closes it in diff_diff/spillover.py:L3089-L3131 and tracks it in TODO.md:L143-L143, so there is no silent correctness bug. Concrete fix: none required for approval; keep the current NotImplementedError until a derived no-effective-PSU serial allocator exists.

Security

No findings.

Documentation/Tests

Severity: P3. Impact: a few surfaces still overstate or underspecify the exact shipped contract. docs/api/spillover.rst:L377-L379 still describes the H=1 lag>0 reduction as plain Newey-West on PSU score sequences, while the implemented and registry-pinned contract is centered PSU totals (docs/methodology/REGISTRY.md:L3289-L3293, diff_diff/two_stage.py:L1035-L1058, diff_diff/two_stage.py:L1124-L1140). diff_diff/two_stage.py:L135-L145 also says the path “requires survey_design.psu set,” even though cluster=<col> injection is a supported effective-PSU route. Separately, CHANGELOG.md:L13-L13 and tests/test_spillover.py:L6452-L6497 still describe a “full meat matrix” regression pin even though the regression only checks ATT and scalar SE. Concrete fix: align the API page and in-code docstring with the centered-score/effective-PSU contract in REGISTRY.md, and reword the changelog/test text to say ATT/SE regression pin unless the meat matrix is actually exposed and asserted.

CI Codex R2 verdict: ✅ Looks good (0 P0/P1/P2). One actionable P3 (docs wording drift across 4 surfaces).

Fixes:
- docs/api/spillover.rst H=1 lag>0 reduction: "plain Newey-West Bartlett HAC on per-PSU score sequences" -> "Newey-West Bartlett HAC on per-period within-stratum-CENTERED per-PSU score sequences (NOT raw; Binder TSL centering is retained at H=1)". Now matches REGISTRY, which was already updated in the previous push.
- diff_diff/two_stage.py _compute_gmm_corrected_meat docstring: "Requires survey_design.psu set" -> "Requires an effective PSU: either explicit survey_design.psu OR cluster=<col> injected as the effective PSU per Wave E.1's _inject_cluster_as_psu routing". Aligns the inline docstring with the actual post-resolution gate at spillover.py.
- CHANGELOG.md: removed "full meat matrix" overclaim. The lag=0 regression pin uses assert_array_equal on user-visible ATT + scalar SE (the meat matrix is not exposed on SpilloverDiDResults); the test_a2 mock-spy independently asserts the serial helper isn't invoked at lag=0.
- tests/test_spillover.py class docstring: same "full meat matrix" wording trimmed to "ATT and scalar SE bit-identity (test_a) plus mock-spy that the serial helper isn't invoked (test_a2)".

P3 informational items NOT addressed (out of scope for this PR per the recurring codex assessment):
- Serial Bartlett kernel duplication between two_stage.py + conley.py (TODO.md row added in previous push; refactor would touch the no-survey panel-block path)
- no-effective-PSU survey + lag>0 fail-closed restriction (TODO.md row exists; methodology derivation pending)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

github-actions · 2026-05-21T10:45:55Z

🔁 AI review rerun (requested by @igerber)

Head SHA: 3b5b44435a8fe02fa88779b0fe88576c2a1ea67f

Overall Assessment

✅ Looks good

Executive Summary

Affected method: SpilloverDiD’s survey-Conley variance path for vcov_type="conley", conley_lag_cutoff > 0.
I cross-checked the new panel-block survey path in diff_diff/two_stage.py:L553-L1147 and the post-resolution gating in diff_diff/spillover.py:L3089-L3131 against docs/methodology/REGISTRY.md:L3279-L3305, the no-survey reference diff_diff/conley.py:L925-L990, and the stratified survey helper diff_diff/survey.py:L1987-L2052.
The load-bearing departures from the no-survey path are documented in the registry: centered serial scores, panel-wide serial FPC, and the effective-PSU requirement. I found no unmitigated P0/P1 methodology defects.
The prior public-doc caveat around the effective-PSU requirement is now resolved in README.md:L109-L109, diff_diff/guides/llms.txt:L61-L61, docs/api/spillover.rst:L329-L410, and docs/references.rst:L205-L207.
Remaining issues are P3 only: one lingering docs/tests overstatement about the lag=0 guarantee, plus tracked maintainability/tech-debt rows in TODO.md:L142-L143.
Static review only; I could not run pytest because the available Python interpreter in this environment does not have pytest or numpy.

Methodology

No findings. The new spatial+serial decomposition, effective-PSU fail-closed gate, centered serial scores, and panel-wide serial FPC all match the documented Wave E.2 follow-up contract in docs/methodology/REGISTRY.md:L3279-L3305; the documented deviations from the no-survey reference are explicitly labeled Notes/Restrictions, so they are not defects.

Code Quality

No findings.

Performance

No findings.

Maintainability

Severity: P3-informational. Impact: the Bartlett serial-kernel construction and post-meat diagnostic logic are still duplicated between diff_diff/two_stage.py:L845-L873, diff_diff/two_stage.py:L878-L1147, and diff_diff/conley.py:L949-L990, so the survey and no-survey panel-block paths can drift. Concrete fix: factor out a shared serial-kernel helper and shared finite/PSD guard, as already tracked in TODO.md:L142-L142.

Tech Debt

Severity: P3-informational. Impact: vcov_type="conley" + conley_lag_cutoff > 0 + survey_design= still deliberately rejects no-effective-PSU designs at diff_diff/spillover.py:L3089-L3131; this is fail-closed and explicitly tracked, so it is not a blocker. Concrete fix: none required for approval; keep the current NotImplementedError until a derived no-effective-PSU allocator exists, per TODO.md:L143-L143.

Security

No findings.

Documentation/Tests

Severity: P3. Impact: the lag=0 regression narrative still overstates what is guaranteed. docs/api/spillover.rst:L372-L374 and docs/methodology/REGISTRY.md:L3290-L3291 say the orchestrator “early-returns,” and tests/test_spillover.py:L6486-L6498 still implies full meat-matrix equality, but the code only skips the serial helper and still runs the new combined-meat diagnostic guard at diff_diff/two_stage.py:L819-L873; ATT/SE are pinned, not the full meat matrix. Concrete fix: reword those docs/tests to “serial helper is not invoked; lag=0 ATT/SE regression is pinned,” or implement a true early return plus an exposed/asserted meat regression if that stronger claim is intended.
No further findings. The previously noted effective-PSU documentation gap appears resolved at README.md:L109-L109, diff_diff/guides/llms.txt:L61-L61, docs/api/spillover.rst:L329-L410, and docs/references.rst:L205-L207.
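The narrower guarantee suggested here ("serial helper is not invoked") is the sort of thing a mock-spy can pin. The sketch below uses a toy class, not the library's orchestrator: `ToyOrchestrator` and its methods are hypothetical, and only `unittest.mock.patch.object` with `wraps` is real API.

```python
from unittest import mock

import numpy as np


class ToyOrchestrator:
    """Toy stand-in for a spatial + serial orchestrator (not the library code)."""

    @staticmethod
    def serial_meat(z, lag_cutoff):
        total = 0.0
        for lag in range(1, min(lag_cutoff, z.shape[1] - 1) + 1):
            weight = 1.0 - lag / (lag_cutoff + 1)
            total += 2.0 * weight * float((z[:, :-lag] * z[:, lag:]).sum())
        return total

    @classmethod
    def combined_meat(cls, z, lag_cutoff=0):
        meat = float((z**2).sum())  # stand-in for the same-period term
        if lag_cutoff > 0:  # at lag 0 the helper is skipped; the rest still runs
            meat += cls.serial_meat(z, lag_cutoff)
        return meat


z = np.array([[1.0, -1.0, 2.0], [-1.0, 1.0, -2.0]])

# wraps= keeps the real behavior while recording calls on the spy.
with mock.patch.object(
    ToyOrchestrator, "serial_meat", wraps=ToyOrchestrator.serial_meat
) as spy:
    meat_lag0 = ToyOrchestrator.combined_meat(z, lag_cutoff=0)
    calls_at_lag0 = spy.call_count
    meat_lag1 = ToyOrchestrator.combined_meat(z, lag_cutoff=1)
    calls_at_lag1 = spy.call_count
```

A spy of this shape asserts only that the helper was or was not called; it says nothing about the rest of the code path, which is why the value pin on ATT and SE is a separate assertion.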

…rate

CI Codex R3 verdict: ✅ Looks good (0 P0/P1/P2). One actionable P3 (lag=0 narrative wording drift across 4 surfaces).

The "orchestrator early-returns" phrasing was technically inaccurate: at lag=0 the orchestrator does NOT early-return. It still runs the spatial per-period loop, the saturation NaN-fail guard, and the new combined-meat PSD/finite guard. The lag=0 guarantee is narrower: the serial helper is not INVOKED, so `meat_serial` contributes nothing to the combined meat.

Fixes:
- docs/api/spillover.rst H=1 lag>0 reduction note + lag=0 reduction line: "orchestrator early-returns before invoking the serial helper" -> "orchestrator skips the serial helper invocation when L = 0 so meat_serial does not contribute; the test_a2 mock-spy independently asserts the helper isn't invoked".
- docs/methodology/REGISTRY.md lag=0 reduction note: clarify that the spatial loop + saturation guard + new PSD/finite guard still run; the guarantee is on user-visible ATT + scalar SE, not on the meat matrix.
- tests/test_spillover.py test_a + test_a2 docstrings: reword "early-return" to "skips the serial helper invocation"; explicitly note that full meat-matrix equality is NOT asserted (only ATT + scalar SE).
- CHANGELOG.md: mirrored wording fix.

P3 informational items NOT addressed (both TODO-tracked):
- Serial Bartlett kernel duplication between two_stage.py + conley.py
- no-effective-PSU survey + lag>0 fail-closed restriction

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

github-actions · 2026-05-21T10:57:34Z

🔁 AI review rerun (requested by @igerber)

Head SHA: 82dda96f8b1a1bffcd5cc7174297f319b3ef5d8d

Overall Assessment

✅ Looks good

Executive Summary

Affected method: SpilloverDiD’s survey-Conley variance path for vcov_type="conley", conley_lag_cutoff > 0, implemented in diff_diff/two_stage.py:553-1147 with the post-resolution gate in diff_diff/spillover.py:3089-3131.
I cross-checked the new spatial+serial composition against the Methodology Registry at docs/methodology/REGISTRY.md:3279-3305, the no-survey panel-block reference at diff_diff/conley.py:925-990, and the existing survey spatial helper at diff_diff/survey.py:1987-2052.
The load-bearing departures from the no-survey path are documented in the registry: serial centering, panel-wide serial FPC, and the effective-PSU requirement. I found no unmitigated P0/P1 methodology defects.
The prior lag=0 wording/test overstatement appears resolved. The docs now describe skipping the serial helper rather than an early return, and the tests pin ATT/SE plus helper non-invocation rather than claiming exposed full-meat equality. See docs/api/spillover.rst:372-379, docs/methodology/REGISTRY.md:3289-3294, and tests/test_spillover.py:6484-6559.
Remaining issues are P3 only: duplicated serial-kernel/diagnostic logic already tracked in TODO.md:142-142, and the intentional no-effective-PSU restriction already tracked in TODO.md:143-143.
Static review only; I could not run the tests because the local Python environment here does not have pytest or numpy.

Methodology

No findings. Affected method: survey-weighted panel-block Conley for SpilloverDiD. The new meat_spatial + meat_serial construction, the post-resolution fail-closed gate for no-effective-PSU designs, the centered serial scores, and the panel-wide serial FPC all match the documented Wave E.2 follow-up contract in docs/methodology/REGISTRY.md:3279-3305 and the code paths in diff_diff/two_stage.py:553-1147. The deviations from the no-survey reference at diff_diff/conley.py:925-990 are explicitly labeled Notes/Restrictions, so they are not defects under the review rubric.

Code Quality

No findings.

Performance

No findings.

Maintainability

Severity: P3-informational. Impact: the Bartlett serial-kernel construction and the post-meat finite/PSD diagnostics are now duplicated between the survey panel-block path and the no-survey panel-block path, which creates a drift risk if one side changes later. Concrete fix: factor a shared serial-kernel helper and shared post-meat diagnostic helper, as already tracked in TODO.md:142-142. References: diff_diff/two_stage.py:845-873, diff_diff/two_stage.py:1105-1144, diff_diff/conley.py:928-990, TODO.md:142-142.

Tech Debt

Severity: P3-informational. Impact: vcov_type="conley" + conley_lag_cutoff > 0 + survey_design= still intentionally rejects no-effective-PSU designs after survey resolution/injection. This is fail-closed, documented, and tracked, so it is not a blocker. Concrete fix: none required for approval; keep the current NotImplementedError until a no-effective-PSU serial derivation exists. References: diff_diff/spillover.py:3089-3131, docs/methodology/REGISTRY.md:3299-3305, TODO.md:143-143.

Security

No findings.

Documentation/Tests

No findings. The prior re-review P3 on lag=0 wording/testing looks addressed: docs now describe the actual helper-skip behavior, and tests pin the user-visible contract without overstating an early return. References: docs/api/spillover.rst:372-379, docs/methodology/REGISTRY.md:3289-3294, tests/test_spillover.py:6484-6559.
Static review only; I could not execute the added tests because the local Python environment lacks pytest and numpy.

Release notes consolidate 8 PRs since 3.4.0 (2026-05-19):

Public-surface variance lifts:
- SpilloverDiD survey_design on HC1/CR1 via Binder TSL (Wave E.1, igerber#468)
- SpilloverDiD vcov_type=conley + survey_design via stratified-Conley on PSU totals (Wave E.2, igerber#474) + lag_cutoff>0 follow-up (igerber#477)
- SunAbraham vcov_type ∈ {classical, hc1, hc2, hc2_bm} (Phase 1b 1/8, igerber#472)
- WLS-CR2 Bell-McCaffrey gates lifted via clubSandwich port (igerber#475)

Methodology-review-tracker promotions (mostly docs/tests):
- PreTrendsPower R pretrends parity goldens (PR-C, igerber#471)
- HAD methodology-review-tracker promotion (igerber#473)
- ContinuousDiD methodology-review-tracker promotion (igerber#476)

All changes additive; bit-equal defaults preserved across the affected estimators. No new estimators (patch-level per semver convention).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

igerber added the ready-for-ci label (Triggers CI test workflows) May 21, 2026

igerber merged commit 88a4362 into main May 21, 2026
33 of 34 checks passed

igerber deleted the spillover-conley-wave-e2-followup-lag branch May 21, 2026 12:19

igerber mentioned this pull request May 21, 2026

Release 3.4.1: SpilloverDiD survey + Conley lifts, SunAbraham vcov_type, WLS-CR2 BM, methodology-tracker promotions #480

Merged

igerber merged 4 commits into main from spillover-conley-wave-e2-followup-lag
